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Abstract 

We consider the problem of finding optimally stable polynomial approximations to the exponential 
for application to one-step integration of initial value ordinary and partial differential equations. The 
objective is to find the largest stable step size and corresponding method for a given problem when the 
spectrum of the initial value problem is known. The problem is expressed in terms of a general least 
deviation feasibility problem. Its solution is obtained by a new fast, accurate, and robust algorithm 
based on convex optimization techniques. Global convergence of the algorithm is proven in the case that 
the order of approximation is one and in the case that the spectrum encloses a starlike region. Examples 
demonstrate the effectiveness of the proposed algorithm even when these conditions are not satisfied. 

1 Stability of Runge— Kutta methods 

Runge-Kutta methods are among the most widely used types of numerical integrators for solving initial 
value ordinary and partial differential equations. The time step size should be taken as large as possible 
since the cost of solving an initial value problem (IVP) up to a fixed final time is proportional to the number 
of steps that must be taken. In practical computation, the time step is often limited by stability and accuracy 
constraints. Either accuracy, stability, or both may be limiting factors for a given problem; see e.g. [Ml 
Section 7.5] for a discussion. The linear stability and accuracy of an explicit Runge-Kutta method are 
characterized completely by the so-called stability polynomial of the method, which in turn dictates the 
acceptable step size pj [T3] . In this work we present an approach for constructing a stability polynomial that 
allows the largest absolutely stable step size for a given problem. 

In the remainder of this section, we review the stability concepts for Runge-Kutta methods and for- 
mulate the stability optimization problem. Our optimization approach, described in Section [2j is based on 
reformulating the stability optimization problem in terms of a sequence of convex subproblems and using 
bisection. We examine the theoretical properties of the proposed algorithm and prove its global convergence 
for two important cases. 

A key element of our optimization algorithm is the use of numerical convex optimization techniques. 
We avoid a poorly conditioned numerical formulation by posing the problem in terms of a polynomial basis 
that is well-conditioned when sampled over a particular region of the complex plane. These numerical 
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considerations, which become particularly important when the number of stages of the method is allowed to 
be very large, are discussed in Section [3j 

In Section [4] we apply our algorithm to several examples of complex spectra. Cases where optimal results 
are known provide verification of the algorithm, and many new or improved results are provided. 

Determination of the stability polynomial is only half of the puzzle of designing optimal explicit Runge- 
Kutta methods. The other half is the determination of the Butcher coefficients. While simply finding 
methods with a desired stability polynomial is straightforward, many additional challenges arise in that 
context; for instance, additional nonlinear order conditions, internal stability, storage, and embedded error 
estimators. The development of full Runge-Kutta methods based on optimal stability polynomials is the 
subject of ongoing work [21)] . 

1.1 The stability polynomial 

A linear, constant-coefficient initial value problem takes the form 

u'{t) = Lu u(0) = u 0) (1) 

where u{t) : R -> and L G R NxN . When applied to the linear IVP 0, any Runge-Kutta method 
reduces to an iteration of the form 



u n = R(hL)u n _x, (2) 

where h is the step size and u n is a numerical approximation to u(nh). The stability function R{z) depends 
only on the coefficients of the Runge-Kutta method [Till Section 4.3] [3 US]- In general, the stability function 
of an s-stage explicit Runge-Kutta method is a polynomial of degree s 

s 

R(z) = Y,a jZ i. (3) 
j=o 

Recall that the exact solution of ([I]) is u(t) = cxp(tL)u . Thus, if the method is accurate to order p, the 
stability polynomial must be identical to the exponential function up to terms of at least order p: 

cij = — for < j < p. (4) 



1.2 Absolute stability 

The stability polynomial governs the local propagation of errors, since any perturbation to the solution will 
be multiplied by R(z) at each subsequent step. The propagation of errors thus depends on which 
leads us to define the absolute stability region 

S = {zeC: \R{z)\ < 1}. (5) 

For example, the stability region of the classical fourth-order method is shown in Figure l(b)| 

Given an initial value problem 0, let A G C denote the spectrum of the matrix L. We say the iteration 
([2]) is absolutely stable if 

hXeS for all A G A. (6) 



2 



Condition ([6| implies that u n remains bounded for all n. More importantly, ^ is a necessary condition for 
stable propagation of error^] Thus the maximum stable step size is given by 

Stable = max{/i > : \R(hX)\ < 1 for A G A}. (7) 

As an example, consider the advection equation 

d d 

—u{x, t) + g^u(x, t) = 0, x G (0, M), 
discretized in space by first-order upwind differencing with spatial mesh size Ax 

UM = - m) \ U " {t) 0<KN 
Ax 

with periodic boundary condition Uo(t) = {7j\r(t). This is a linear IVP with L a circulant bidiagonal 
matrix. The eigenvalues of L are plotted in Figure [1(a) | for Ax = 1, N = M = 20. To integrate this system 
with the classical fourth-order Runge-Kutta method, the time step size must be taken small enough that the 



scaled spectrum {h\i} lies inside the stability region. Figure 1(c) shows the (maximally) scaled spectrum 
superimposed on the stability region. 

The motivation for this work is that a larger stable step size can be obtained by using a Runge-Kutta 
method with a larger region of absolute stability. Figure |l(d) shows the stability region of an optimized 



ten-stage Runge-Kutta method of order four that allows a much larger step size. The ten-stage method was 
obtained using the technique that is the focus of this work. Since the cost of taking one step is typically 
proportional to the number of stages s, we can compare the efficiency of methods with different numbers of 
stages by considering the effective step size h/s. Normalizing in this manner, it turns out that the ten-stage 
method is nearly twice as fast as the traditional four-stage method. 



1.3 Design of optimal stability polynomials 

We now consider the problem of choosing a stability polynomial so as to maximize the step size under 
which given stability constraints are satisfied. The objective function f(x) is simply the step size h. The 
stability conditions yield nonlinear inequality constraints. Typically one also wishes to impose a minimal 
order of accuracy. The monomial basis representation ^ of R(z) is then convenient because the first p + 1 
coefficients {a ,ai, . . . ,a p } of the stability polynomial are simply taken to satisfy the order conditions Q. 
As a result, the space of decision variables has dimension s + 1 — p, and is comprised of the coefficients 
{a p+ i, a p+ 2i • • - > o-s}i a s well as the step size h. Then the problem can be written as 

Problem 1 (stability optimization). Given A C C, order p, and number of stages s, 

maximize h 

a p + l , a p+2 ,-.-s&s)<& 

subject to \R{hX)\ - 1 < 0, VA G A. 

We use i/opt to denote the solution of Problem [T] (the optimal step size) and i? op t to denote the optimal 
polynomial. 

The set A may be finite, corresponding to a finite-dimensional ODE system or PDE semi-discretization, or 
infinite (but bounded) , corresponding to a PDE or perhaps its semi-discretization in the limit of infinitesimal 
mesh width. In the latter case, Problem [T] is a semi-infinite program (SIP). In Section [4] we approach this by 
using a finite discretization of A; for a discussion of this and other approaches to semi-infinite programming, 
see Q3]. 

1 For non-normal L, it may be important to consider the pseudospectrum rather than the spectrum; see Section 4.3 
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Figure 1: (a) spectrum of first-order upwind difference matrix using N — 20 points in space; (b) stability 
region of the classical fourth-order Runge-Kutta method; (c) Scaled spectrum hX with h = 1.39; (d) Scaled 
spectrum hX for optimal 10-stage method with h = 6.54. 



1.4 Previous work 

The problem of finding optimal stability polynomials is of fundamental importance in the numerical solution 
of initial value problems, and its solution or approximation has been studied by many authors for several 

decades 1321 1121 HH1 HZl HH HH1 mi [201 EH1 HH EH SSI [23 EH [H El El HH El EH EH El HI HH ESI EH- 

Indeed, it is closely related to the problem of finding polynomials of least deviation, which goes back to 
the work of Chebyshev. A nice review of much of the early work on Runge-Kutta stability regions can be 
found in [JS]- The most-studied cases are those where the eigenvalues lie on the negative real axis, on the 
imaginary axis, or in a disk of the form \z + w\ < w. Many results and optimal polynomials, both exact 
and numerical, are available for these cases. Much less is available regarding the solution of Problem [T] for 
arbitrary spectra A^. 

Two very recent works serve to illustrate both the progress that has been made in solving these problems 
with nonlinear programming, and the challenges that remain. In |38j . optimal schemes are sought for inte- 
gration of discontinuous Galerkin discretizations of wave equations, where the optimality criteria considered 
include both accuracy and stability measures. The approach used there is based on sequential quadratic 
programming (local optimization) with many initial guesses. The authors consider methods of at most fourth 
order and situations with s — p < 4 "because the cost of the optimization procedure becomes prohibitive for 
a higher number of free parameters." In [21], optimally stable polynomials are found for certain spectra of 
interest for 2 < p < 4 and (in a remarkable feat!) s as large as 14. The new methods obtained achieve a 
40-50% improvement in efficiency for discontinuous Galerkin integration of the 3D Maxwell equations. The 
optimization approach employed therein is again a direct search algorithm that does not guarantee a globally 
optimal solution but "typically converges ... within a few minutes". However, it was apparently unable to 
find solutions for s > 14 or p > 4. The method we present in the next section can rapidly find solutions for 
significantly larger values of s,p, and is provably globally convergent under certain assumptions (introduced 
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in section [2]). 



2 An efficient algorithm for design of globally optimal stability 
polynomials 

Evidently, finding the global solution of Problem [l] is in general quite challenging. Although the Karush- 
Kuhn- Tucker (KKT) conditions provide necessary conditions for optimality in the solution of nonlinear 
programming problems, the stability constraints in Problem[l]are nonconvex, hence suboptimal local minima 
may exist. 

2.1 Reformulation in terms of the least deviation problem 

The primary theoretical advance leading to the new results in this paper is a reformulation of Problem [T] 
Note that Problem [I] is (for s > 2) nonconvex since R(hX) is a nonconvex function in h. 

Instead of asking for the maximum stable step size we now ask, for a given step size h, how small the 
maximum modulus of R(h\) can be. This leads to a generalization of the classical least deviation problem. 

Problem 2 ((Least Deviation)). Given A C C, h £ R + and p,s £ N 

minimize max (|i?(/iA| — 1) . 

We denote the solution of Problem [2] by r PiS (h,A), or simply r(h,A). Note that | -R(-z) | is convex with 
respect to dj, since R(z) is linear in the a,j- Therefore, Problem [2] is convex. Furthermore, Problem [I] can 
be formulated in terms of Problem [2] 

Problem 3 (Reformulation of Problem[T]). Given A C C, and p,s £ N, 

maximize h 

a p + l ,«p+2,...,O s 

subject to r PtS (h,A) < 0. 

2.2 Solution via bisection 

Although Problem [3] is not known to be convex, it is an optimization in a single variable. It is natural then 
to apply a bisection approach, as outlined in Algorithm [TJ 

As long as h max is chosen large enough, it is clear that r(h,A) = for some h £ [h m i n , h max \. Global 
convergence of the algorithm is assured only if the following condition holds: 

r P)8 (fto, A) = => r PtS (h, A) < for all < h < h . (8) 

We now consider conditions under which condition ([8| can be established. We have the following important 
case. 

Theorem 1 (Global convergence when p = 1). Let p = 1, A C C and s > 1. Take /i max large enough so 
that r(/i max , A) > 0. Let H opt denote the solution of Problem^ Then the output of Algorithm^ satisfies 

]imH e = H opt . 
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Algorithm 1 Simple bisection 

^min 

while h max - h min > e do 

II (^max ~\~ ^min)/2 

Solve Problem [2] 

if r PtS (h, A) < then 

^min ^ 

else 

^max ^ 

end if 
end while 
return H e = h m i n 



Proof. Since r(0,A) = < r(/i max , A) and r(h,A) is continuous in h, it is sufficient to prove that condition 
^ holds. We have \R opt (I 
such that oq = a\ — 1 and 



holds. We have |-R op t(-Ho P tA)| < 1 for all A e A. We will show that there exists R^iz) = Ylj=o a i(A*) zJ 



li^GuifoptA)] < 1 VA g A, 0</x<l. 
Let be the coefficients of the optimal polynomial: 

s 

Ropt(z) = l + z + y^QjZ 3 , 

J=2 



and set 
Then 



R„(pH opt X) = 1 + ,iH opt A + J "j!' J "U^ ! =1+HE %#o P t Aj 

= l + ^(i? op t(-ff ptA)- 1), 

where we have defined ai = 1. Define <7a(a0 = Rp,(pH opt X). Then g\(fJ.) is linear in [i and has the property 
that, for A G A, |<7a(0)| = 1 and |<7a(1)| < 1 (by the definition of H opt , R op t)- Thus by convexity \g(^)\ < 1 
for < n < 1. □ 

For p > 1, condition Q does not necessarily hold. For example, take s = p = 4; then the stability 
polynomial ^ is uniquely defined as the degree-four Taylor approximation of the exponential, corresponding 
to the classical fourth-order Runge-Kutta method that we saw in the introduction. Its stability region is 



plotted in Figure |l(b)[ Taking, e.g., A = 0.21 + 2.3i, one finds \R(X)\ < 1 but \R{X/2)\ > 1. Although this 
example shows that Algorithm [I] might formally fail, it concerns only the trivial case s = p in which there 
is only one possible choice of stability polynomial. We have searched without success for a situation with 
s > p for which condition (l8|) is violated. 



G 



2.3 Convergence for starlike regions 



In many important applications the relevant set A is an infinite set; for instance, if we wish to design a 
method for some PDE semi-discretization that will be stable for any spatial discretization size. In this case, 
Problem[I]is a semi-infinite program (SIP) as it involves infinitely many constraints. Furthermore, A is often 
a closed curve whose interior is starlike with respect to the origin; for example, upwind semi-discretizations 
of hyperbolic PDEs have this property. Recall that a region S is starlike if t G S implies /j,t G S for all 
< n < 1. 

Lemma 1. Let A G C be a closed curve passing through the origin and enclosing a starlike region. Let 
r(h,A) denote the solution of Problem^ Then condition ([8j holds. 

Proof. Let A be as stated in the lemma. Suppose r(/iQ,A) = for some ho > 0; then there exists R(z) 
such that \R(h\)\ < 1 for all A G A. According to the maximum principle, the stability region of R(z) must 
contain the region enclosed by A. Choose h such that < h < ho; then hA lies in the region enclosed by A, 
so \R(hX)\ < 1 for A G A. □ 

The proof of Lemma [l] relies crucially on A being an infinite set, but in practice we numerically solve 
Problem [2] with only finitely many constraints. To this end we introduce a sequence of discretizations A n 
with the following properties: 

1. A„ c A 

2. ni < n 2 => A ni C A„ 2 

3. lim,^ oo A„ = A 

4. lim„_ s . 00 v n = where v n denotes the maximum distance from a point in A to the set A„: 

v n = max min I7 — A|. 

7 £A AGA„ 1 ' 1 



For instance, A„ can be taken as an equispaced (in terms of arc-length, say) sampling of n points. 

By modifying Algorithm [T] we can approximate the solution of the semi-infinite programming problem 
for starlike regions to arbitrary accuracy. At each step we solve Problem [2] with A n replacing A. The key 
to the modified algorithm is to only increase h m - la after obtaining a certificate of feasibility. This is done 
by using the Lipschitz constant of R(z) over a domain including hA (denoted by L(R,hA)) to ensure that 
|i?(ft,A)| < 1. The modified algorithm is stated as Algorithm [2j 

The following lemma, which characterizes the behavior of Algorithm [2j holds whether or not the interior 
of A is starlike. 

Lemma 2. Let h^ denote the value of h after k iterations of the loop in Algorithm^ Then either 

• Algorithm^ terminates after a finite time with outputs satisfying r(/i m j n ,A) < 0, r(/i max ,A) > 0; or 

• there exists j < 00 such that r(hP\ A) = and h^ = fob'] for all j > k. 

Proof. First suppose that r(hP\ A) = for some j. Then neither feasibility nor infeasibility can be certified 
for this value of h, so h^ — h^ for all j > k. 

On the other hand, suppose that r(h)- k \ A) 7^ for all k. The algorithm will terminate as long as, for each 
h\ k \ either feasibility or infeasibility can be certified for large enough n. If r(/J fe l,A) > 0, then necessarily 
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Algorithm 2 Bisection for SIP 

^min 

femax = 2s 2 /max|A| 
n = uq 

while h max - h min > e do 

h (^max ^min)/2 

Solve Problem [2] 

if r(h, A n ) < and v n < -2r/L(R, hA) then 

^min — h 

else if r(h,A n ) > then 

^max ^ 

else 

end if 
end while 
return H c = h m i n 



> Bisect 

> Certifies that r(h, A) < 

> Certifies that r(h, A) > 

> -5 < r(h,A n ) < 
[> Reduce the discretization spacing 



r(h [k] ,A n ) > for large enough n, so infeasibility will be certified. We will show that if r(fv- k \ A) < 0, then 
for large enough n the condition 



u n < -2r/L(R,hA) 



(9) 



must be satisfied. Since r(h,A n ) < r(h,A) is bounded away from zero and lim„_ i . 00 v n = 0, d9j) must be 
satisfied for large enough n unless the Lipschitz constant L(R, hA) is unbounded (with with respect to n) 
for some fixed h. Suppose by way of contradiction that this is the case, and let JjW, . . . denote the 
corresponding sequence of optimal polynomials. Then the norm of the vector of coefficients a'- appearing in 
i?M must also grow without bound as i — > oo. By Lemma [3J this implies that |i?W(z)| is unbounded except 
for at most s points z £ C. But this contradicts the condition |i?M(/iA)| < 1 for A € A„ when n> s. Thus, 
for large enough n we must have v n < —2r/L(R, hA). □ 



In practical application, r{h, A) = will not be detected, due to numerical errors; see Section 3.1 For this 



reason, in the next theorem we simply assume that Algorithm [2] terminates. We also require the following 
technical result, whose proof is deferred to the appendix. 



Lemma 3. Let RW,RW 
coefficients of R^ by 



, . . . be a sequence of polynomials of degree at most s (s G N fixed) and denote the 

e C (% e N, < j < s): 



R®{z) 



E 

i=o 



a. z- 



z e 



Further, let 
sequences 



— (do , fflj , • • • , a[ <) T and suppose that the sequence \\afi\\ is unbounded in 
are unbounded for all but at most s points z G C. 



Then the 



Proof. Suppose to the contrary there are s + 1 distinct complex numbers, say, zq, z\, . . . , z s such that the 
vectors r, := (R® (z ), (z{), . . . , R [l] (z s )) T (i G N) are bounded inC s+1 . Let V denote the (s + 1) x (s + 1) 
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Vandermonde matrix whose k th row (0 < k < s + 1) is (1, Zk, zi, . . . , zi ). Then V is invertible and we have 
aM = y _1 rj (i e N), so if I'l denotes the induced matrix norm, then 

110^11 = 11^-^11 < my- 1 ! ||n||. 

But, by assumption, the right hand side is bounded, whereas the left hand side is not. □ 

Theorem 2 (Global convergence for strictly starlike regions). Let A be a closed curve that encloses a region 
that is starlike with respect to the origin. Suppose that Algorithm^ terminates for all small enough e, and 
let H e denote the value returned by Algorithm^ for a given e. Let H opt denote the solution of Problem^ 
Then 

limife = H opt . 

Proof. Due to the assumptions and Lemma [2j we have that r(/i m i n , A) < < r(h max ,A). Then Lemma [I] 
implies that h m - m < H opt < h mllx . Noting that also h max — h mm < e, the result follows. □ 

Despite the lack of a general global convergence proof, Algorithm [I] works very well in practice even for 
general A when p > 1. In all cases we have tested and for which the true H opt is known (see Section |4|, 
Algorithm [T] appears to converge to the globally optimal solution. Furthermore, Algorithm [l] is very fast. 
For these reasons, we consider the (much slower) Algorithm [2] to be of primarily theoretical interest, and we 
base our practical implementation on Algorithm [T] 



3 Numerical implementation 

We have made a prototype implementation of Algorithm [T] in Matlab. The implementation relies heavily 
on the CVX package JTSJ, [TT], a MATLAB-based modeling system for convex optimization, which in turn relies 
on the interior-point solvers SeDuMi and SDPT3 [H]. The least deviation problem (Problem [2]) can be 
succinctly stated in four lines of the CVX problem language, and for many cases is solved in under a second 
by either of the core solvers. 



Our implementation re-attempts failed solves (see Section 3.2) with the alternate interfaced solver. In 
our test cases, we observed that the SDPT3 interior-point solver was slower, but more robust than SeDuMi. 
Consequently, our prototype implementation uses SDPT3 by default. 

Using the resulting implementation, we were able to successfully solve problems to within 0.1% accuracy 
or better with scaled eigenvalue magnitudes \hX\ as large as 4000. As an example, comparing with results 
of [5] for spectra on the real axis with p = 3, s = 27, our results are accurate to 6 significant digits. 



3.1 Feasibility threshold 

In practice, CVX often returns a small positive objective (r « 10~ 7 ) for values of h that are just feasible. 
Hence the bisection step is accepted if r < e where e« 1. The results are generally insensitive (up to the 
first few digits) to the choice of e over a large range of values; we have used e = 10~ 7 for all results in this 
work. The accuracy that can be achieved is eventually limited by the need to choose a suitable value e. 



3.2 Conditioning and change of basis 

Unfortunately, for large values of hX, the numerical solution of Problem [2] becomes difficult due to ill- 
conditioning of the constraint matrix. Observe from pi) that the constrained quantities R(h\) are related 
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to the decision variables dj through multiplication by a Vandermonde matrix. Vandermonde matrices are 
known to be ill-conditioned for most choices of abscissas. For very large hX, the resulting CVX problem cannot 
be reliably solved by either of the core solvers. 

A first approach to reducing the condition number of the constraint matrix is to rescale the monomial 
basis. We have found that a more robust approach for many types of spectra can be obtained by choosing 
a basis that is approximately orthogonal over the given spectrum {A}. Thus we seek a solution of the form 

s 3 

R(z) =y2djQj(z) where Qj{z) = ^b jk z k . (10) 

j=0 fc=0 

Here Qj{z) is a degree- j polynomial chosen to give a well-conditioned constraint matrix. The drawback of 
not using the monomial basis is that the dimension of the problem is s + 1 (rather than s + 1 — p) and we 
must now impose the order conditions explicitly: 

s 1 

^2 a j b jk = ^ for fc = 0,l,...,p. (11) 

3=0 

Consequently, using a non-monomial basis increases the number of design variables in the problem and 
introduces an equality constraint matrix B g R pxs that is relatively small (when p « s), but usually very 
poorly conditioned. However, it can dramatically improve the conditioning of the inequality constraints. 

The choice of the basis Qj(z) is a challenging problem in general. In the special case of a negative real 
spectrum, an obvious choice is the Chebyshev polynomials (of the first kind) Tj, shifted and scaled to the 
domain [hx, 0] where x = minAeA Re(A), via an affine map: 

Q ^= T ^{ l+ Vx)- (12) 

The motivation for using this basis is that \Qj(hX)\ < 1 for all A € [hx, 0]. This basis is also suggested by the 
fact that Qj(z) is the optimal stability polynomial in terms of negative real axis inclusion for p = 1, s = j. In 
Section|4j we will see that this choice of basis works well for more general spectra when the largest magnitude 
eigenvalues lie near the negative real axis. 

As an example, we consider a spectrum of 3200 equally spaced values A in the interval [—1,0]. The 
exact solution is known to be h = 2s 2 . Figure [2] shows the relative error as well as the inequality constraint 
matrix condition number obtained by using the monomial ^ and Chebyshev ( 12 ) bases. Typically, the 



solver is accurate until the condition number reaches about 10 16 . This supports the hypothesis that it is 
the conditioning of the inequality constraint matrix that leads to failure of the solver. The Chebyshev basis 
keeps the condition number small and yields accurate answers even for very large values of h. 



3.3 Choice of initial upper bound 

The bisection algorithm requires as input an initial /i max such that r(ft, max , A) > 0. Theoretical values can 
be obtained using the classical upper bound of 2s 2 jx if A encloses a negative real interval [x, 0], or using the 
upper bound given in [33J if A encloses an ellipse in the left half-plane. Alternatively, one could start with 
a guess and successively double it until r(/i max , A) > is satisfied. Since evaluation of r(h, A) is typically 
quite fast, finding a tight initial /i max is not an essential concern. 
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Figure 2: Condition number of principal constraint matrix and relative solution accuracy versus optimal 
step size. The points along a given curve correspond to different choices of s. 



4 Examples 

We now demonstrate the effectiveness of our algorithm by applying it to determine optimally stable poly- 
nomials (i.e., solve Problem [lj for various types of spectra. As stated above, we use Algorithm [T] for its 
simplicity, speed, and effectiveness. When A corresponds to an infinite set, we approximate it by a fine 
discretization. 



4.1 Verification 

In this section, we apply our algorithm to some well-studied cases with known exact or approximate results 
in order to verify its accuracy and correctness. In addition to the real axis, imaginary axis, and disk cases 
below, we have successfully recovered the results of [28 . Our algorithm succeeds in finding the globally 
optimal solution in every case for which it is known, except in some cases of extremely large step sizes for 
which the underlying solvers (SDPT3 and SeDuMi) eventually fail. 



4.1.1 Negative real axis inclusion 

Here we consider the largest h such that [— h, 0] <E S by taking A = [—1, 0]. This is the most heavily studied 
case in the literature, as it applies to the semi-discretization of parabolic PDEs and a large increase of H opt 
is possible when s is increased (see, e.g., [3H H31 H7J IS1 EH] ) • For first-order accurate methods (p — 1), the 
optimal polynomials are just shifted Chebyshev polynomials, and the optimal timestep is H opt — 2s 2 . Many 
special analytical and numerical techniques have been developed for this case; the most powerful seems to 
be that of Bogatyrev [6] . 

We apply our algorithm to a discretization of A (using 6400 evenly-spaced points) and using the shifted 
and scaled Chebyshev basis (12). Results for up to s — 40 are shown in Table [l] (note that we list H opt /s 2 for 



easy comparison, since H opt is approximately proportional to s 2 in this case). We include results for p = 10 
to demonstrate the algorithm's ability to handle high-order methods. For p — 1 and 2, the values computed 
here match those available in the literature [42]. Most of the values for p = 3,4 and 10 are new results. 
Figure [3] shows some examples of stability regions for optimal methods. As observed in the literature, it 
seems that H opt /s 2 tends to a constant (that depends only on p) as s increases. For large values of s, some 
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353 
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000 
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132 


40 


2 


000 





821 


0.500 





355 





132 



Table 1: Scaled size of real axis interval inclusion for optimized methods. 

results in the table have an error of about 10~ 3 due to inaccuracies in the numerical results provided by the 
interior point solvers. 

4.1.2 Imaginary axis inclusion 

Next we consider the largest h such that [— ih, ih] <E S by taking A = xi, x € [—1, 1]. Optimal polynomials for 
imaginary axis inclusion have also been studied by many authors, and a number of exact results are known 
or conjectured [HI 251 HOI I2U HH S3] ■ We again approximate the problem, taking TV = 3200 evenly-spaced 
values in the interval [0, i] (note that stability regions are necessarily symmetric about the real axis since 
R{z) has real coefficients). We use a "rotated" Chebyshev basis defined by 

<M,)-rti(£). 

where x = maxj(| Im(Ai)|). Like the Chebyshev basis for the negative real axis, this basis dramatically 
improves the robustness of the algorithm for imaginary spectra. Table [2] shows the optimal effective step 
sizes. In agreement with [321 HI]; we find H = s — 1 for p = 1 (all s) and for p = 2 (s odd). We also 
find H = s — 1 for p = 1 and s even, which was conjectured in [45] and confirmed in |43j . We find 
H op t — \J s(s — 2) for p = 2 and s even, strongly suggesting that the polynomials given in [5D] are optimal 
for these cases; on the other hand, our results show that those polynomials, while third order accurate, are 
not optimal for p = 3 and s odd. Figure [4] shows some examples of stability regions for optimal methods. 

4.1.3 Disk inclusion 

In the literature, attention has been paid to stability regions that include the disk 

D(h) = {z:\l + z/h\<l}, (13) 
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(b) p = 10, s = 20 

Figure 3: Stability regions of some optimal methods for real axis inclusion. 
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Table 2: Scaled size of imaginary axis inclusion for optimized methods. 
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(a) p = 1, s = 7 (b) p = 4, s = 7 

Figure 4: Stability regions of some optimal methods for imaginary axis inclusion. 



for the largest possible h. As far as we know, the optimal result for p = 1 (H opt = s) was first proved in 
[IB] . The optimal result for p = 2 (i? pt = s — 1) was first proved in |45) . Both results have been unwittingly 
rediscovered by later authors. For p > 2, no exact results are available. 
We use the basis 

z N 



,•(*) 



1 



Note that Qj(z) is the optimal polynomial for the case s — j , p = 1. This basis can also be motivated 
by recalling that Vandermonde matrices are perfectly conditioned when the points involved are equally 
spaced on the unit circle. Our basis can be obtained by taking the monomial basis and applying an affine 



transformation that shifts the unit circle to the disk (13). This basis greatly improves the robustness of the 
algorithm for this particular spectrum. We show results for p < 4 in Figure [5] For p = 3 and s = 5, 6, our 
results give a small improvement over those of |19j . Some examples of optimal stability regions are plotted 
in Figure [6j 



4.2 Spectrum with a gap 

We now demonstrate the effectiveness of our method for more general spectra. First we consider the case 
of a dissipative problem with two time scales, one much faster than the other. This type of problem was 
the motivation for the development of projective integrators in Following the ideas outlined there we 
consider 



A = {z : \z\ = l,R(z) < 0} U {z : \z - a\ = 1}. 



(14) 



We take a = 20 and use the shifted and scaled Chebyshev basis (12). Results are shown in Figure [7] A 
dramatic increase in efficiency is achieved by adding a few extra stages. 
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Figure 5: Relative size of largest disk that can be included in the stability region (scaled by the number of 
stages). 
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Figure 6: Stability regions of some optimal methods for disk inclusion. 
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Figure 7: Optimal methods for spectrum with a gap ( 14 ) with a = 20. 



4.3 Legendre pseudospectral discretization 

Next we consider a system obtained from semidiscretization of the advection equation on the interval [—1,1] 
with homogeneous Dirichlet boundary condition: 



u{t,x = 1) = 0. 



The semi-discretization is based on pseudospectral collocation at points given by the zeros of the Legendre 
polynomials; we take N = 50 points. The semi-discrete system takes the form ([!]), where L is the Legendre 

,)J We compute an optimally stable poly- 
= 1. The stability region of the resulting 



differentiation matrix, whose eigenvalues are shown in Figure 
nomial based on the spectrum of the matrix, taking s = 7 and p 



method is plotted in Figure 8(c) 
stability region 



Using an appropriate step size, all the scaled eigenvalues of L lie in the 

(e)| shows 



However, this method is unstable in practice for any positive step size; Figure 
an example of a computed solution after three steps, where the initial condition is a Gaussian. The resulting 
instability is non-modal, meaning that it does not correspond to any of the eigenvectors of L (compare [4*01 
Figure 31.2]). 

This discretization is now well-known as an example of non- normality [40j Chapters 30-32]. Due to the 
non-normality, it is necessary to consider pseudospectra in order to design an appropriate integration scheme. 
The e-pseudospectrum (see [40 ) is the set 

{zeC: Wiz-D)- 1 ]] >l/e}. 



The e-pseudospectrum (for e = 2) is shown with the eigenvalues in Figure |8(b)[ The instability observed 
above occurs because the stability region does not contain an interval on the imaginary axis about the origin, 
whereas the pseudospectrum includes such an interval. 

We now compute an optimally stable integrator based on the 2-pseudospectrum. This pseudospectrum 
is computed using an approach proposed in [391 Section 20], with sampling on a fine grid. In order to reduce 
the number of constraints and speed up the solution, we compute the convex hull of the resulting set and 
apply our algorithm. The resulting stability region is shown in Figure |8(d)[ It is remarkably well adapted; 
notice the two isolated roots that ensure stability of the modes corresponding to the extremal imaginary 
eigenvalues. We have verified that this method produces a stable solution, in agreement with theory (see 
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Chapter 32 of [ID] ) ; Figure |8(f)| shows an example of a solution computed with this method. The initial 
Gaussian pulse advects to the left. 



4.4 Thin rectangles 

A major application of explicit Runge-Kutta methods with many stages is the solution of moderately stiff 
advection-reaction-diffusion problems [151 44 . For such problems, the stability region must include not only 
a large interval on the negative real axis, but also some region around it, due to convective terms. If centered 
differences are used for the advective terms, it is natural to require that a small interval on the imaginary 
axis be included. Hence, one may be interested in methods that contain a rectangular region 

A K = {A G C : —j3 < Im(A) < /3, -k < Re (A) < 0}. (15) 

for given k, /3. No methods optimized for such regions appear in the literature, and the available approaches 
for devising methods with extended real axis stability (including those of [57] ) cannot be applied to such 
regions. Because of this, previously existing methods are applicable only if upwind differencing is applied to 
convective terms [3H [37] . 

For this example, rather than parameterizing by the step size h, we assume that a desired step size h 
and imaginary axis limit f3 are given based on the convective terms, which generally require small step sizes 
for accurate resolution. We seek to find (for given s,p) the polynomial ([3| that includes A K for k as large 
as possible. This could correspond to selection of an optimal integrator based on the ratio of convective and 
diffusive scales (roughly speaking, the Reynolds number). Since the desired stability region lies relatively 



near the negative real axis, we use the shifted and scaled Chebyshev basis (12 1. 

Stability regions of some optimal methods are shown in Figure [9] The outline of the included rectangle 
is superimposed in black. The stability region for /3 — 10, s = 20, shown in Figure [9] is especially interesting 
as it is very nearly rectangular. A closeup view of the upper boundary is shown in Figure [lOj 

5 Discussion 

The approach described here can speed up the integration of IVPs for which 

• explicit Runge-Kutta methods are appropriate; 

• the spectrum of the problem is known or can be approximated; and 

• stability is the limiting factor in choosing the step size. 

Although we have considered only linear initial value problems, we expect our approach to be useful in 
designing integrators for nonlinear problems via the usual approach of considering the spectrum of the 
Jacobian. A first successful application of our approach to nonlinear PDEs appears in [55] . 

The amount of speedup depends strongly on the spectrum of the problem, and can range from a few 
percent to several times or more. Based on past work and on results presented in Section [4j we expect 
that the most substantial gains in efficiency will be realized for systems whose spectra have large negative 
real parts, such as for semi-discretization of PDEs with significant diffusive or moderately stiff reaction 
components. As demonstrated in Section [4j worthwhile improvements may also be attained for general 
systems, and especially for systems whose spectrum contains gaps. 

The work presented here suggests several extensions and areas for further study. For very high polynomial 
degree, the convex subproblems required by our algorithm exhibit poor numerical conditioning. We have 
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Figure 8: Results for the Legendre differentiation matrix with N = 50. 
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Figure 9: Stability regions of some optimal methods for thin rectangle inclusion. 




Figure 10: Closeup view of upper boundary of the rectangular stability region plotted in Figure [9j 

proposed a first improvement by change of basis, but further improvements in this regard could increase the 
robustness and accuracy of the algorithm. It seems likely that our algorithm exhibits global convergence in 
general circumstances beyond those for which we have proven convergence. The question of why bisection 
seems to always lead to globally optimal solutions merits further investigation. While we have focused 
primarily on design of the stability properties of a scheme, the same approach can be used to optimize 
accuracy efficiency, which is a focus of future work. Our algorithm can also be applied in other ways; for 
instance, it could be used to impose a specific desired amount of dissipation for use in multigrid or as a kind 
of filtering. 

We remark that the problem of determining optimal polynomials subject to convex constraints is very 
general. Convex optimization techniques have already been exploited to solve similar problems in filter 
design 8 , and will likely find further applications in numerical analysis. 

Acknowledgments. We thank Lajos Loczi for providing a simplification of the proof of Lemma [3] We 
are grateful to R.J. LeVeque and L.N. Trefethen for helpful comments on a draft of this work. 
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